Sequence length 4052 



CCAAGATTTAAAGCCCGCAAGTTTTGTTCTTC 
TTTGGACATTTAAAGAGCTGGGCTTGAACTC

TGGCATGGAATATTCACATGGGAGAGCCGCATGAGGCCGCCCACCACGCTTCCTGAAGGATGCCCGTGTGGAAGAATTT 
TGACGTGCCAGTGTCCTCGTTCTACAGGGTGTTCCATTCTTCCGCAATCTCAGAAAAATGGGACTAAAAGAAACTATTT 
TGTAAAATAAGAAGACTTCCATTTTTAATGACCAACATGTATTAAGATGGACACCTACTCTACGAAACACGAAGTTCTA 

M R M L 4 

TGGTCTCGAAGAAGCCCGTGCCTGTTTAAAACTGATCCTAACTAAAAACAGACTTGAGTGGAT ATG AGA ATG TTG 12

VSGRRVKKWQLIIQLFATCF 24

GTT AGT GGC AGA AGA GTC AAA AAA TGG CAG TTA ATT ATT CAG TTA TTT GCT ACT TGT TTT 72 

LASLMFFWEPIDNHIVSHMK 44

TTA GCG AGC CTC ATG TTT TTT TGG GAA CCA ATC GAT AAT CAC ATT GTG AGC CAT ATG AAG 132 

SYSYRYLINSYDFVNDTLSL 64 

TCA TAT TCT TAC AGA TAC CTC ATA AAT AGC TAT GAC TTT GTG AAT GAT ACC CTG TCT CTT 192 

KHTSAGPRYQYLINHKEKCQ 84 

AAG CAC ACC TCA GCG GGG CCT CGC TAC CAA TAC TTG ATT AAC CAC AAG GAA AAG TGT CAA 252

AQDVLLLLFVKTAPENYDRR 104 

GCT CAA GAC GTC CTC CTT TTA CTG TTT GTA AAA ACT GCT CCT GAA AAC TAT GAT CGA CGT 312 

SGIRRTWGNENYVRSQLNAN 124 

TCC GGA ATT AGA AGG ACG TGG GGC AAT GAA AAT TAT GTT CGG TCT CAG CTG AAT GCC AAC 372

IKTLFALGTPNPLEGEELQR 144 

ATC AAA ACT CTG TTT GCC TTA GGA ACT CCT AAT CCA CTG GAG GGA GAA GAA CTA CAA AGA 432

KLAWEDQRYNDIIQQDFVDS 164

AAA CTG GCT TGG GAA GAT CAA AGG TAC AAT GAT ATA ATT CAG CAA GAC TTT GTT GAT TCT 492 

FYNLTLKLLMQFSWANTYCP 184 

TTC TAC AAT CTT ACT CTG AAA TTA CTT ATG CAG TTC AGT TGG GCA AAT ACC TAT TGT CCA 552 

HAKFLMTADDDIFIHMPNLI 204

CAT GCC AAA TTT CTT ATG ACT GCT GAT GAT GAC ATA TTT ATT CAC ATG CCA AAT CTG ATT 612 

EYLQSLEQIGVQDFWIGRVH 224 

GAG TAC CTT CAA AGT TTA GAA CAA ATT GGT GTT CAA GAC TTT TGG ATT GGT CGT GTT CAT 672

RGAPPIRDKSSKYYVSYEMY 244

CGT GGT GCC CCT CCC ATT AGA GAT AAA AGC AGC AAA TAC TAC GTG TCC TAT GAA ATG TAC 732 

QWPAYPDYTAGAAYVISGDV 264

CAG TGG CCA GCT TAC CCT GAC TAC ACA GCC GGA GCT GCC TAT GTA ATC TCC GGT GAT GTA 792

AAKVYEASQTLNSSLYIDDV 284 

GCT GCC AAA GTC TAT GAG GCA TCA CAG ACA CTA AAT TCA AGT CTT TAC ATA GAC GAT GTG 852 



FMGLCANKIGIVPQDHVFFS 304 
TTC ATG GGC CTC TGT GCC AAT AAA ATA GGG ATA GTA CCG CAG GAC CAT GTG TTT TTT TCT 912 

GEGKTPYHPCIYEKMMTSHG 324
GGA GAG GGT AAA ACT CCT TAT CAT CCC TGC ATC TAT GAA AAA ATG ATG ACA TCT CAT GGA 972 

HLEDLQDLWKNATDPKVKTI 344 
CAC TTA GAA GAT CTC CAG GAC CTT TGG AAG AAT GCT ACA GAT CCT AAA GTA AAA ACC ATT 1032

SKGFFGQIYCRLMKIILLCK 364
TCC AAA GGT TTT TTT GGT CAA ATA TAC TGC AGA TTA ATG AAG ATA ATT CTC CTT TGT AAA 1092

ISYVDTYPCRAAFI* 379 

ATT AGC TAT GTG GAC ACA TAC CCT TGT AGG GCT GCG TTT ATC TAA 1137

TAGTACTTGAATGTTGTATGTTTTCACTGTCACTGAGTCAAACCTGGATGAAAAAAACCTTTAAATGTTCGTCTATACC 
CTAAGTAAAATGAGGACGAAAGACAAATATTTTGAAAGCCTAGTC 

AATATCACTTATCTACTTCATTGCCTAAGTTCATTTCAAAGAATTTGTATTTAGAAAAGGTTTATATTATTAGTGAAAA 
CAAAACTAAAGGGAAGTT(:AA(3TTCTCATGTAATGCCACATATATACTTGAGGTGTAGAGATGTTATTAAGAAGTTTTG 
ATGTTAGAATAATTGCTTTTG<^AAAATACCAAATGAACGTACAGTACAACATTTCAAGGAAATGAATATATTGTTAGAC 
CAGGTAAGCAAGTTTATTTTTGTTAAAGAGCACTTGGTGGAGGTAGTAGGGGCAGGGAAAGGTCAGCATAGGAGAGAAA 
GTTCATGAATCTGGTAAAJVCAGTCTCTTGTTCTTAAGAGGAGATGTAGAAAAATGTGTACAATGTTATTATAAACAGAC 
AAATCACGTCTTACCACATCCATGTAGCTACTGGTGTTAGAGTCATTAAAATACCTTTTTTTGCATCTTTTTTCAAAGT 

TTAATGTGAACTTTTAGAAAAGTGATTAATGTTGCCCTAATACTTTATATG 

GAAAATGACACATAACACGGGCAGCTGGTTGCTCATAGGGTCCTTCTCTAGGGAGAAAC 

TGATTTTAATGACGTTTTCAACTGGTTTTTAAATATTC 

ACATAGAGGAATATAATAATGGAGAGACTTCAAATGGAAAGACAGAACATTACAAGCCT 
AAATGAAATCTTAGTGTCTAAATCCTTGTACTGATTACTAAAATTAACCCACTCC 

CAGCACTTTGTTCCAAGTTCAGAGTTTTAAATTGAGAGCATTAAACATCAAAGTTATAATATCTAAAACAATTTATTTT
TCATCAATAACTGTCAGAGGTGATCTTTATTTTCTAAATATTTC 

TTGCCAGTTTGGGGTTAAAGCATTTTTAAAGCTGCATGTTCCTTGTAATCAAAGAGATGTGTCTGAGATCTAATAGAGT
AAGTTACATTTATTTTACAAAGCAGGATAAAAATGTGGCTATAATACACACTACCTCCCT^ 

GTGGTGTCTACTGCTAGGGAGATTATATGAAGGCCAAAATAATGACTTCAGCAAGAGTGACTGAACTCTCTAAGGCC

TTTGACTGCAGAGGCACCTGTTAGGGAAAATCAGATGTCTCATATAA 

GAAAAAAGATTTCTCAGTATACACAACTGAATGATGATACTTACAATTTTTAG 

ATTTTAATTTTTTTCTATTTTGAAATTTGAGGCTTGTTTACATTGCTTAGATAATTTAGAATTTTTAACTAATGTCAAA







ACTACAGTGTCAAACATTCTAGGTTGTAGTTACTTTCAGAGTAGATACAGGGTTTTAGATC 

TGACCAATTAAAAAAACATAGAGAACAAAAGCATATTTGACCAAGC 

GATTAATGATGTATTGCCTTTTGCCCATATATACCCTGTC 

AAACATAAGTGTCTCTGGCCATCAAAGTGATCTTGTTTACAGCAGTGCTTTTGTGAAACAATTATTTATTTGCTGAAAG 
AGCTCTTCTGAACTGTGTCCTTTTAATTTTTGCTTAGAATAGAATGGAACAAGTTTAAATTTCAAGGAAATATGAAGGC
ACTTCCTTTTTTTCTAAGAAGGAAGTTGCTAGATGATTCCTTCATCACACTTACTTAAAGTACTGAGAAGAGTATCTGT 
AAATAAAAGGGTTCCAACCTTTTAAAAAAGAAGGAAAAAACT 

TGTCAACAAAGGGAAAATAAACTATCAGCTTGGATGGTCACTTGAATAGAAGATGGTTATACACAGTGTTATTGTTAAA 
ATTTTTTTACCTl^TGGTTGGTTTGCATCTTTTTTCCATATTGTT 

TTGAATTTTGCTCTTGTATGGCAAAATAATTAGTGAGTTTAAAAAAAATCTAl'AGTTTCCAATAAACAACTGAAAAATT 
AAAAAAAA 
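The amino acid rows printed above each codon row of the coding region can be cross-checked against the nucleotides. The following is a minimal sketch, assuming the standard genetic code (NCBI translation table 1); the function name `translate` and the two sample rows are chosen here for illustration and are not part of the original listing.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), with codons enumerated
# in T, C, A, G order so they line up with the amino acid string below.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(codons: str) -> str:
    """Translate spaced or contiguous codons; '*' marks a stop codon."""
    seq = "".join(codons.split()).upper()
    if len(seq) % 3:
        raise ValueError("coding sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# First and last codon rows of the coding region, copied from the listing.
first_row = translate("ATG AGA ATG TTG")
last_row = translate(
    "ATT AGC TAT GTG GAC ACA TAC CCT TGT AGG GCT GCG TTT ATC TAA"
)
print(first_row, last_row)
```

The coding region is numbered to nucleotide 1137, that is 379 codons including the terminal TAA, which is consistent with the 378-residue protein analysed below.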




Analysis of 8797 (378 aa) 



[Figure: feature map of 8797 plotted against residue position (axis ticks at 1, 41, 81, 121, 161, 201, 241, 281, 321, 361). Recoverable labels: "no Htm hits", Galactosyl_T, Cys, Ngly.]



>8797 

mrmlvsgrrvkkwqliiqlfatcflaslmffwepidnhivshmksysyrylinsydfvnd
tlslkhtsagpryqylinhkekcqaqdvllllfvktapenydrrsgirrtwgnenyvrsq
lnaniktlfalgtpnplegeelqrklawedqryndiiqqdfvdsfynltlkllmqfswan
tycphakflmtadddifihmpnlieylqsleqigvqdfwigrvhrgappirdksskyyvs
yemyqwpaypdytagaayvisgdvaakvyeasqtlnsslyiddvfmglcankigivpqdh
vffsgegktpyhpciyekmmtshghledlqdlwknatdpkvktiskgffgqiycrlmkii
llckisyvdtypcraafi



Protein Family / Domain Matches, HMMer version 2 

Searching for complete domains in PFAM 






hmmpfam - search a single seq against HMM database 
HMMER 2.1.1 (Dec 1998) 

Copyright (C) 1992-1998 Washington University School of Medicine 

HMMER is freely distributed under the GNU General Public License (GPL).

HMM file: /prod/ddm/seqanal/PFAM/pfam5.4/Pfam

Sequence file: /prod/ddm/wspace/orfanal/oa-script.19955.seq

Query: 8797 

Scores for sequence family classification (score includes all domains):
Model Description Score E-value

Galactosyl_T Galactosyl transferase 173.8 2.8e-48

Parsed for domains: 

Model Domain seq-f seq-t hmm-f hmm-t score E-value

Galactosyl_T 1/1 102 321 1 249 [] 173.8 2.8e-48
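The domain row can be read programmatically. The following is a minimal sketch, assuming whitespace-separated fields in the order given by the header row; the function name `parse_domain_row` and the dictionary keys are hypothetical, and the parser simply skips the boundary markers (such as `[]` or `..`) that appear beside the coordinates.

```python
# Markers printed beside the coordinates; they are not numeric fields.
BOUNDARY_MARKERS = {"..", "[]", "[.", ".]"}


def parse_domain_row(row: str) -> dict:
    """Split one 'Parsed for domains' row into typed fields."""
    fields = [f for f in row.split() if f not in BOUNDARY_MARKERS]
    model, domain, seq_from, seq_to, hmm_from, hmm_to, score, evalue = fields
    return {
        "model": model,
        "domain": domain,
        "seq_from": int(seq_from),
        "seq_to": int(seq_to),
        "hmm_from": int(hmm_from),
        "hmm_to": int(hmm_to),
        "score": float(score),
        "evalue": float(evalue),
    }


hit = parse_domain_row("Galactosyl_T 1/1 102 321 1 249 [] 173.8 2.8e-48")
# Number of query residues covered by the match (inclusive coordinates).
matched_residues = hit["seq_to"] - hit["seq_from"] + 1
print(hit["model"], matched_residues)
```

With the row above, the match covers residues 102 to 321 of the query, 220 residues in total, against positions 1 to 249 of the model.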

Alignments of top-scoring domains: 

Galactosyl_T: domain 1 of 1, from 102 to 321: score 173.8, E = 2.8e-48

*->arRnaiRkTWmnqnnsegvadgrikalFlvGl.sakgdqklkklvme
+rR iR+TW+n+n++++ ++ ik+lF + G + ++++++ ++l+ + + +
8797 102 DRRSGIRRTWGNENYVRSQLNANIKTLFALGTpNPLEGEELQRKLAW 148

EakrtlyGDiiwDleDsYenLtlKTltillygvskcpsakligKiDdDv 
E++ y Dii++D+ Ds++nLtlK l+ +++++++cp+ak+ + DdD+
8797 149 EDQ--RYNDIIQQDFVDSFYNLTLKLLMQFSWANTYCPHAKFLMTADDDI 196 

fvnpdkLlslLereniridpsessfyGyiikegepvrrkkskrdWYvppt
f+ +++L+++L+ i ++++++ G+++++ +p+r k sk Yv+++
8797 197 FIHMPNLIEYLQSL-EQIGVQDFWI-GRVHRGAPPIRDKSSK--YYVSYE 242

eYpcsrNgnkYPpYvsGpfYllsrdAAplIlkaskhrLr.fIkiEDVliT
Y + YP Y +G Y++s+d+A ++++as + ++ 1 i+DV++
8797 243 MYQWPA----YPDYTAGAAYVISGDVAAKVYEASQTL-NsSLYIDDVFM- 286

Gilaedlglsrinlprlsistnlfrfhhsqkdndgcdvfawhtahkndpe
G +a+++gl + + + + f+ + + + + h+ + +e 

8797 287 GLCANKIGIVPQDH VFFSGEGKTPY HPCIYE 317 

ylif<-*
+ + + 

8797 318 KMMT 321 



Transmembrane Segments Predicted by MEMSAT
>8797

MRMLVSGRRVKKWQLIIQLFATCFLASLMFFWEPIDNHIVSHMKSYSYRYLINSYDFVND
TLSLKHTSAGPRYQYLINHKEKCQAQDVLLLLFVKTAPENYDRRSGIRRTWGNENYVRSQ
LNANIKTLFALGTPNPLEGEELQRKLAWEDQRYNDIIQQDFVDSFYNLTLKLLMQFSWAN
TYCPHAKFLMTADDDIFIHMPNLIEYLQSLEQIGVQDFWIGRVHRGAPPIRDKSSKYYVS
YEMYQWPAYPDYTAGAAYVISGDVAAKVYEASQTLNSSLYIDDVFMGLCANKIGIVPQDH
VFFSGEGKTPYHPCIYEKMMTSHGHLEDLQDLWKNATDPKVKTISKGFFGQIYCRLMKII
LLCKISYVDTYPCRAAFI